Generalization Bounds for Randomized Learning with Application to Stochastic Gradient Descent
Abstract
Randomized algorithms are central to modern machine learning. In the presence of massive datasets, researchers often turn to stochastic optimization to solve learning problems. Of particular interest is stochastic gradient descent (SGD), a first-order method that approximates the learning objective and its gradient by a random point estimate. A classical question in learning theory is whether a randomized learner, given access to only a finite training sample, produces a model that generalizes to the data's generating distribution.
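To make the update rule concrete, the following is a minimal SGD sketch in Python with NumPy; the least-squares loss, learning rate, and variable names are illustrative assumptions, not notation from the paper.

    import numpy as np

    def sgd(grad_fn, w0, data, lr=0.05, epochs=20, seed=0):
        # At each step the full-sample gradient is replaced by the gradient
        # at a single randomly chosen example: a random point estimate
        # whose expectation is the true gradient.
        rng = np.random.default_rng(seed)
        w = np.array(w0, dtype=float)
        for _ in range(epochs):
            for i in rng.permutation(len(data)):
                x, y = data[i]
                w -= lr * grad_fn(w, x, y)
        return w

    # Illustrative usage: least-squares regression on synthetic data.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(100, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)
    grad = lambda w, x, t: (x @ w - t) * x  # gradient of 0.5 * (x.w - t)^2
    w_hat = sgd(grad, np.zeros(3), list(zip(X, y)))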
Similar Resources
A PAC-Bayesian Analysis of Randomized Learning with Application to Stochastic Gradient Descent
We analyze the generalization error of randomized learning algorithms, focusing on stochastic gradient descent (SGD), using a novel combination of PAC-Bayes and algorithmic stability. Importantly, our risk bounds hold for all posterior distributions on the algorithm's random hyperparameters, including distributions that depend on the training data. This inspires an adaptive sampling...
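For orientation, a classical PAC-Bayes bound (a generic McAllester/Maurer-style form, not this paper's specific result) states that with probability at least $1-\delta$ over an i.i.d. sample of size $n$, simultaneously for every posterior $Q$ over hypotheses and any data-independent prior $P$,

\[
\mathbb{E}_{h \sim Q}[R(h)] \;\le\; \mathbb{E}_{h \sim Q}[\hat{R}_S(h)] \;+\; \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\!\left(2\sqrt{n}/\delta\right)}{2n}},
\]

where $R$ and $\hat{R}_S$ denote population and empirical risk; the point emphasized in the abstract is that the guarantee holds uniformly over all posteriors $Q$, including data-dependent ones.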
Stability and Generalization of Learning Algorithms that Converge to Global Optima
We establish novel generalization bounds for learning algorithms that converge to global minima. We do so by deriving black-box stability results that only depend on the convergence of a learning algorithm and the geometry around the minimizers of the loss function. The results are shown for nonconvex loss functions satisfying the Polyak-Łojasiewicz (PL) and the quadratic growth (QG) conditions...
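For reference, the two conditions named above have standard forms: a differentiable loss $f$ with minimum value $f^*$ and minimizer set $\mathcal{X}^*$ satisfies the PL condition with parameter $\mu > 0$ if

\[
\tfrac{1}{2}\,\lVert \nabla f(x) \rVert^2 \;\ge\; \mu\,\bigl(f(x) - f^*\bigr) \quad \text{for all } x,
\]

and the QG condition if

\[
f(x) - f^* \;\ge\; \tfrac{\mu}{2}\,\operatorname{dist}(x, \mathcal{X}^*)^2 \quad \text{for all } x.
\]

PL implies QG, and neither requires convexity.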
Generalization Error Bounds with Probabilistic Guarantee for SGD in Nonconvex Optimization
The success of deep learning has led to rising interest in the generalization properties of the stochastic gradient descent (SGD) method, and stability is one popular approach to studying them. Existing works based on stability have studied nonconvex loss functions, but only considered the generalization error of SGD in expectation. In this paper, we establish various generalization error bounds...
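To fix terminology: for an algorithm output $\hat{w} = A(S)$ trained on a sample $S = (z_1, \dots, z_n)$ drawn from a distribution $\mathcal{D}$, the generalization error is the population-minus-empirical risk gap

\[
\operatorname{gen}(A, S) \;=\; R\bigl(A(S)\bigr) - \hat{R}_S\bigl(A(S)\bigr), \qquad R(w) = \mathbb{E}_{z \sim \mathcal{D}}[\ell(w, z)], \quad \hat{R}_S(w) = \frac{1}{n} \sum_{i=1}^{n} \ell(w, z_i);
\]

a bound in expectation controls $\mathbb{E}_S[\operatorname{gen}(A, S)]$, whereas the probabilistic guarantees referred to here control $\operatorname{gen}(A, S)$ with probability at least $1-\delta$ over the draw of $S$.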
Data-Dependent Stability of Stochastic Gradient Descent
We establish a data-dependent notion of algorithmic stability for Stochastic Gradient Descent (SGD) and employ it to develop novel generalization bounds. This is in contrast to previous distribution-free algorithmic stability results for SGD, which depend on worst-case constants. By virtue of the data-dependent argument, our bounds provide new insights into learning with SGD on convex and non...
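For contrast with the data-dependent notion, the classical distribution-free definition (Bousquet and Elisseeff) is uniform stability: an algorithm $A$ is $\varepsilon$-uniformly stable if for every pair of samples $S, S'$ differing in a single example,

\[
\sup_{z}\, \bigl|\, \ell\bigl(A(S), z\bigr) - \ell\bigl(A(S'), z\bigr) \,\bigr| \;\le\; \varepsilon,
\]

and this worst-case constant $\varepsilon$ is what a data-dependent argument replaces with quantities tied to the actual data distribution.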
Stability and Convergence Trade-off of Iterative Optimization Algorithms
The overall performance or expected excess risk of an iterative machine learning algorithm can be decomposed into training error and generalization error. While the former is controlled by its convergence analysis, the latter can be tightly handled by algorithmic stability (Bousquet and Elisseeff, 2002). The machine learning community has a rich history investigating convergence and stability s...
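The decomposition referenced above can be written explicitly: for output $\hat{w} = A(S)$ and population risk minimizer $w^*$,

\[
\mathbb{E}\bigl[R(\hat{w})\bigr] - R(w^*) \;=\; \underbrace{\mathbb{E}\bigl[R(\hat{w}) - \hat{R}_S(\hat{w})\bigr]}_{\text{generalization error}} \;+\; \underbrace{\mathbb{E}\bigl[\hat{R}_S(\hat{w}) - \hat{R}_S(w^*)\bigr]}_{\text{training error}},
\]

where the remaining term $\mathbb{E}[\hat{R}_S(w^*)] - R(w^*)$ vanishes because $w^*$ does not depend on $S$; the first term is controlled by stability and the second by the convergence analysis.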
Publication date: 2016